
81 research outputs found

The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection

Author: A Rokas
AD Leaché
B Carstens
B Holland
B Rannala
C Ané
C Ané
C Ané
C Meng
C Than
C Than
C Than
CR Linder
CV Than
D Huson
D Posada
D Ruths
D Swofford
DA Pollard
DL Swofford
ES Allman
ES Allman
EW Bloomquist
G Schwarz
H Akaike
H Huang
H Lanier
J Heled
J Mallet
J Mallet
J Wakeley
James H. Degnan
JH Degnan
JH Degnan
JH Degnan
JJ Doyle
Joseph Felsenstein
JP Huelsenbeck
K Burnham
L Liu
L Liu
L Nakhleh
LL Knowles
LS Kubatko
LS Kubatko
LS Kubatko
Luay Nakhleh
M DeGiorgio
M Nei
M Slatkin
ML Arnold
NA Rosenberg
SM Ross
SV Edwards
SV Edwards
TC Bruen
W Maddison
Y Wang
Y Wu
Y Yu
Yun Yu
Publication venue: Public Library of Science
Publication date: 01/01/2012

Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data are consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies for obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa.
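
For orientation only, the smallest case of the quantity this abstract refers to can be worked by hand. With three taxa and one sampled lineage per species, the multispecies coalescent gives the gene tree topology that matches the species tree a probability of 1 - (2/3)e^(-t), and each of the two other topologies (1/3)e^(-t), where t is the internal branch length in coalescent units. The sketch below is not the authors' network algorithm. The function names are hypothetical, and treating hybridization as a simple mixture of two parental trees with inheritance probability gamma is a simplifying assumption made here for illustration.

```python
import math

# The three possible rooted topologies for taxa A, B, C.
TOPOLOGIES = ("((A,B),C)", "((A,C),B)", "((B,C),A)")


def gene_tree_probs(species_tree: str, t: float) -> dict:
    """Three-taxon gene tree topology probabilities under the
    multispecies coalescent, one lineage sampled per species.

    t is the internal branch length in coalescent units.
    """
    if species_tree not in TOPOLOGIES:
        raise ValueError(f"unknown topology: {species_tree}")
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability of each topology that disagrees with the species tree.
    discordant = math.exp(-t) / 3.0
    return {
        topo: (1.0 - 2.0 * discordant) if topo == species_tree else discordant
        for topo in TOPOLOGIES
    }


def mixture_probs(tree1: str, t1: float, tree2: str, t2: float,
                  gamma: float) -> dict:
    """ASSUMPTION: a locus follows tree1 with probability gamma and
    tree2 with probability 1 - gamma (a two-parental-tree mixture)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    p1 = gene_tree_probs(tree1, t1)
    p2 = gene_tree_probs(tree2, t2)
    return {topo: gamma * p1[topo] + (1.0 - gamma) * p2[topo]
            for topo in TOPOLOGIES}
```

With t = 0 the three topologies are equally likely, which is why very short internal branches produce heavy gene tree discordance even without hybridization.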

Public Library of Science (PLOS)

Crossref

UC Research Repository

Directory of Open Access Journals

PubMed Central

DSpace at Rice University

FigShare

Coalescent-based genome analyses resolve the early branches of the Euarchontoglires

Despite numerous large-scale phylogenomic studies, certain parts of the mammalian tree are extraordinarily difficult to resolve. We used the coding regions from 19 completely sequenced genomes to study the relationships within the super-clade Euarchontoglires (Primates, Rodentia, Lagomorpha, Dermoptera and Scandentia) because the placement of Scandentia within this clade is controversial. The difficulty in resolving this issue is due to the short time spans between the early divergences of Euarchontoglires, which may cause incongruent gene trees. The conflict in the data can be depicted by network analyses and the contentious relationships are best reconstructed by coalescent-based analyses. This method is expected to be superior to analyses of concatenated data in reconstructing a species tree from numerous gene trees. The total concatenated dataset used to study the relationships in this group comprises 5,875 protein-coding genes (9,799,170 nucleotides) from all orders except Dermoptera (flying lemurs). Reconstruction of the species tree from 1,006 gene trees using coalescent models placed Scandentia as sister group to the primates, which is in agreement with maximum likelihood analyses of concatenated nucleotide sequence data. Additionally, both analytical approaches favoured the Tarsier to be sister taxon to Anthropoidea, thus belonging to the Haplorrhine clade. When divergence times are short such as in radiations over periods of a few million years, even genome scale analyses struggle to resolve phylogenetic relationships. On these short branches processes such as incomplete lineage sorting and possibly hybridization occur and make it preferable to base phylogenomic analyses on coalescent methods.

Crossref

Directory of Open Access Journals

PubMed Central

Hochschulschriftenserver - Universität Frankfurt am Main

Inference of population splits and mixtures from genome-wide allele frequency data

Author: A Keinan
A RoyChoudhury
AL Price
AR Boyko
BM Henn
BM vonHoldt
BS Weir
C Becquet
D Reich
D Reich
D Reich
DH Huson
DJ Lawson
EY Durand
G Bhatia
G Coop
G Hellenthal
G Liti
G McVean
G Nicholson
GM Lathrop
HG Parker
Hua Tang
I Gronau
J Felsenstein
J Felsenstein
J Felsenstein
J Hey
J Novembre
J Novembre
J Novembre
J Sirén
J Sukumaran
JK Pritchard
JK Pritchard
Jonathan K. Pritchard
Joseph K. Pickrell
JZ Li
K Lindblad-Toh
LL Cavalli-Sforza
LL Cavalli-Sforza
LL Cavalli-Sforza
LL Cavalli-Sforza
LS Kubatko
M Bonhomme
M DeGiorgio
M Jakobsson
M Nei
M Nei
M Rasmussen
MA Beaumont
N Patterson
N Patterson
N Saitou
NA Rosenberg
O François
P Beerli
P Menozzi
P Moorjani
R Nielsen
RE Green
RJ Dyer
RL Cann
RN Gutenkunst
RR Hudson
S Xu
SF Schaffner
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In this model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication, and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com

Comment: 28 pages, 6 figures in main text. Attached supplement is 22 pages, 15 figures. This is an updated version of the preprint available at http://precedings.nature.com/documents/6956/version/

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

FigShare

Molecular Systematics of the Deep-Sea Hydrothermal Vent Endemic Brachyuran Family Bythograeidae: A Comparison of Three Bayesian Species Tree Methods

Author: A Graybeal
A Rambaut
A Stamatakis
A Stamatakis
A Stamatakis
A Sáez
AB Williams
AD Leaché
AI Dittel
BGM Jamieson
BR Larget
C Ané
C Ané
C Bachraty
C McLay
C Than
C Than
Carlos A. Santamaria
CC Tudge
CL Van Dover
D Guinot
D Guinot
D Guinot
D Guinot
D Guinot
D Guinot
D Guinot
D Guinot
D Guinot
D Posada
Danièle Guinot
DJ Colgar
DJ Zwickl
DR Maddison
F Micheli
F Ronquist
FE Anderson
G Talavera
H Karasawa
J Castresana
J Heled
J Sukumaran
J-S Yang
JAA Nylander
JCY Lai
JD Thompson
JH Degnan
JP Huelsenbeck
JR Voight
JW Martin
JY Lee
KA Cranston
L Kubatko
L Liu
L Liu
L Liu
LA Gorodezky
LA Hurtado
LL Knowles
LM Tsang
LS Kubatko
LS Kubatko
Luis A. Hurtado
M de Saint Laurent
M Pagel
M Segonzac
MA Newton
MA Suchard
Mariana Mateos
MF Whiting
MF Whiting
NM Belfiore
O Folmer
P Chevaldonné
PKL Ng
RD Vetter
RE Kass
RR Hessler
RV Sternberg
S Tsuchida
S Tsuchida
Sharyn Jane Goldstien
SR Palumbi
SV Edwards
T Atwater
TJ Mickel
TJS Merritt
TM Shank
V Tunnicliffe
Vincent Leignel
WP Maddison
WS Moore
Y Chung
Y Won
ZH Yang
Publication venue: Public Library of Science
Publication date: 01/01/2012

Brachyuran crabs of the family Bythograeidae are endemic to deep-sea hydrothermal vents and represent one of the most successful groups of macroinvertebrates that have colonized this extreme environment. Occurring worldwide, the family includes six genera (Allograea, Austinograea, Bythograea, Cyanagraea, Gandalfus, and Segonzacia) and fourteen formally described species. To investigate their evolutionary relationships, we conducted Maximum Likelihood and Bayesian molecular phylogenetic analyses, based on DNA sequences from fragments of three mitochondrial genes (16S rDNA, Cytochrome oxidase I, and Cytochrome b) and three nuclear genes (28S rDNA, the sodium–potassium ATPase α-subunit ‘NaK’, and Histone H3A). We employed traditional concatenated (i.e., supermatrix) phylogenetic methods, as well as three recently developed Bayesian multilocus methods aimed at inferring species trees from potentially discordant gene trees. We found strong support for two main clades within Bythograeidae: one comprising the members of the genus Bythograea; and the other comprising the remaining genera. Relationships within each of these two clades were partially resolved. We compare our results with an earlier hypothesis on the phylogenetic relationships among bythograeid genera based on morphology. We also discuss the biogeography of the family in the light of our results. Our species tree analyses reveal differences in how each of the three methods weighs conflicting phylogenetic signal from different gene partitions and how limits on the number of outgroup taxa may affect the results.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Texas A&M Repository

FigShare

Species Tree Estimation for the Late Blight Pathogen, Phytophthora infestans, and Close Relatives

Author: A Drummond
A Haverkort
A McDonald
A Rambaut
A Rambaut
A Riethmuller
A Rokas
AD Leache
AG Clark
AG Clark
BJ Haas
BL Gross
BR Larget
C Ane
C Ane
C Meng
CA Buerkle
CL Schardl
CM Brasier
CM Brasier
CM Brasier
D Cooke
D Posada
DA Baum
DC Erwin
DD Ence
DJ Zwickl
DL Swofford
DM Spooner
EM Goss
F Jacobsen
F Ronquist
FN Martin
FN Martin
FN Martin
FN Martin
Frank N. Martin
G Kimmel
GA Forbes
IE Peralta
J Felsenstein
J Galindo
J Heled
J Mallet
J Moskin
J Rozas
J-M Moncalvo
Jaime E. Blair
JC Zadoks
JD Thompson
JE Blair
JE Richardson
JH Degnan
JS Niederhauser
JT Weir
K Tamura
KA Cranston
Keith A. Crandall
L Eronen
L Excoffier
L Gomez-Alpizar
L Gomez-Alpizar
L Liu
L Liu
L Liu
LPNM Kroon
LS Kubatko
LS Kubatko
LS Kubatko
M Balke
M Cárdenas
M Krings
M Stephens
ME Ordonez
Michael D. Coffey
ML Serrano-Serrano
NA Douglas
NE Adler
NJ Grunwald
OP Hurtado-Gonzales
P Pamilo
PJM Bonants
R Ioos
R Ioos
R Ioos
R Pennington
RF Oliva
RG Olmstead
S Knapp
S Raffaele
S Savary
SF Altschul
SV Edwards
SV Edwards
T White
WA Man in 't Veld
WA Man in 't Veld
WE Fry
WG Flier
WP Maddison
Y Chen
Y Yu
Z-S Huang
Z-S Huang
ZG Abad
Publication venue: Public Library of Science
Publication date: 17/05/2012

To better understand the evolutionary history of a group of organisms, an accurate estimate of the species phylogeny must be known. Traditionally, gene trees have served as a proxy for the species tree, although it was acknowledged early on that these trees represented different evolutionary processes. Discordances among gene trees and between the gene trees and the species tree are also expected in closely related species that have rapidly diverged, due to processes such as the incomplete sorting of ancestral polymorphisms. Recently, methods have been developed for the explicit estimation of species trees, using information from multilocus gene trees while accommodating heterogeneity among them. Here we have used three distinct approaches to estimate the species tree for five Phytophthora pathogens, including P. infestans, the causal agent of late blight disease in potato and tomato. Our concatenation-based “supergene” approach was unable to resolve relationships even with data from both the nuclear and mitochondrial genomes, and from multiple isolates per species. Our multispecies coalescent approach using both Bayesian and maximum likelihood methods was able to estimate a moderately supported species tree showing a close relationship among P. infestans, P. andina, and P. ipomoeae. The topology of the species tree was also identical to the dominant phylogenetic history estimated in our third approach, Bayesian concordance analysis. Our results support previous suggestions that P. andina is a hybrid species, with P. infestans representing one parental lineage. The other parental lineage is not known, but represents an independent evolutionary lineage more closely related to P. ipomoeae. While all five species likely originated in the New World, further study is needed to determine when and under what conditions this hybridization event may have occurred.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Genetic Networking of the Bemisia tabaci Cryptic Species Complex Reveals Pattern of Biological Invasions

Author: A Dinsdale
A Estoup
A Rodrigo
AK Sakai
AR Templeton
Bandelt
Barry L. Williams
BC O'Meara
C Lejeusne
D Kelly
D Tautz
DD Ence
DN Hebert P
DW Crowder
DW Crowder
E Diday
EM Pilgrim
G Gentile
H Chen
HJ Bandelt
HJ Bandelt
HJ Bandelt
J Cracraft
J Heled
J Hu
J Pons
J Xu
JC Avise
JD Thompson
JJ Wiens
JN Seal
JP Wares
JW Sites
K de Queiroz
K Strimmer
K Tamura
L Excoffier
L Excoffier
L Excoffier
L Liu
LM Boykin
LR Foulds
LS Kubatko
LS Kubatko
M Clement
M Eigen
M Elbaz
M Kimura
MP Cummings
MT Monaghan
Muhammad Z. Ahmed
MW Hart
MZ Ahmed
NA Rosenberg
NC Ellestrand
ND Tsutsui
P Lee
P Wang
P Wang
Paul De Barro
PD Hebert
PD Hebert
PJ De Barro
R Dalton
R Xavier
S Cheek
S Xu
SJ Goldstien
SJ Novak
V Makarenkov
WM Fitch
Z Abdo
Z Goldstein P
Z Yang
Publication venue: Public Library of Science
Publication date: 03/10/2011

BACKGROUND: A challenge within the context of cryptic species is the delimitation of individual species within the complex. Statistical parsimony network analytics offers the opportunity to explore limits in situations where there are insufficient species-specific morphological characters to separate taxa. The results also enable us to explore the spread in taxa that have invaded globally. METHODOLOGY/PRINCIPAL FINDINGS: Using a 657 bp portion of mitochondrial cytochrome oxidase 1 from 352 unique haplotypes belonging to the Bemisia tabaci cryptic species complex, the analysis revealed 28 networks plus 7 unconnected individual haplotypes. Of the networks, 24 corresponded to the putative species identified using the rule set devised by Dinsdale et al. (2010). Only two species proposed in Dinsdale et al. (2010) departed substantially from the structure suggested by the analysis. The analysis of the two invasive members of the complex, Mediterranean (MED) and Middle East - Asia Minor 1 (MEAM1), showed that in both cases only a small number of haplotypes represent the majority that have spread beyond the home range; one MEAM1 and three MED haplotypes account for >80% of the GenBank records. Israel is a possible source of the globally invasive MEAM1 whereas MED has two possible sources. The first is the eastern Mediterranean which has invaded only the USA, primarily Florida and to a lesser extent California. The second are western Mediterranean haplotypes that have spread to the USA, Asia and South America. The structure for MED supports two home range distributions, a Sub-Saharan range and a Mediterranean range. The MEAM1 network supports the Middle East - Asia Minor region. CONCLUSION/SIGNIFICANCE: The network analyses show a high level of congruence with the species identified in a previous phylogenetic analysis. The analysis of the two globally invasive members of the complex supports the view that global invasion often involves very small portions of the available genetic diversity.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Nuclear versus mitochondrial DNA: evidence for hybridization in colobine monkeys

Author: A Rambaut
A Rambaut
AE Lebatard
AE Pusey
AG Davies
AH Salem
AH Salem
AJ Drummond
AJ Drummond
AJ Drummond
AJ Tosi
AM Shedlock
B Hallet
BR Benefit
C Darwin
C Roos
CB Stewart
CB Stewart
Christian Roos
Christiane Schwarz
CP Groves
CP Groves
D Brandon-Jones
D Chakraborty
D Funk
D Posada
D Zinner
D Zinner
DA Pollard
DA Ray
DA Ray
Dietmar Zinner
Dirk Meyer
DJ Zwickl
DL Swofford
Dyah Perwitasari-Farajallah
E Delson
E Mayr
E Meijaard
E Strasser
F Ronquist
Fabian H Leendertz
FS Szalay
H Kishino
H Philippe
H Shimodaira
IS Zalmout
J Castresana
J Kelley
J Li
J Schmitz
J Schmitz
J Xing
J Xing
J Xing
JC Avise
Jinchuan Xing
JP Huelsenbeck
JR Napier
K Katoh
K McCracken
KG Miller
KN Sterner
KP Burnham
KP Karanth
L Cortés-Ortiz
L Hellborg
Laura S Kubatko
LN Van de Lagemaat
LS Kubatko
LS Kubatko
LS Whitfield
Lutz Walter
M Brunet
M Goodman
M Osterholz
M Osterholz
MA Batzer
Mark A Batzer
Markus Brameier
Martin Osterholz
MG Leakey
ML Arnold
Mouyu Yang
N Okada
N Patterson
N Ting
N Ting
NG Jablonski
NG Jablonski
NH Barton
NM Young
O Seehausen
O Thalmann
P Vignaud
PJ Whybrow
R Chaves
R Nichols
RE Green
RJ Petit
RL Raaum
S Koblmüller
S Merker
S Pääbo
SK Wyman
Stephen D Nash
SW Herke
Thomas Ziegler
Tilo Nadler
VN Thinh
WP Maddison
Y Rumpler
YP Zhang
YZ Peng
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Colobine monkeys constitute a diverse group of primates with major radiations in Africa and Asia. However, phylogenetic relationships among genera are under debate, and recent molecular studies with incomplete taxon-sampling revealed discordant gene trees. To solve the evolutionary history of colobine genera and to determine causes for possible gene tree incongruences, we combined presence/absence analysis of mobile elements with autosomal, X chromosomal, Y chromosomal and mitochondrial sequence data from all recognized colobine genera. Results: Gene tree topologies and divergence age estimates derived from different markers were similar, but differed in placing Piliocolobus/Procolobus and langur genera among colobines. Although insufficient data, homoplasy and incomplete lineage sorting might all have contributed to the discordance among gene trees, hybridization is favored as the main cause of the observed discordance. We propose that African colobines are paraphyletic, but might later have experienced female introgression from Piliocolobus/Procolobus into Colobus. In the late Miocene, colobines invaded Eurasia and diversified into several lineages. Among Asian colobines, Semnopithecus diverged first, indicating langur paraphyly. However, unidirectional gene flow from Semnopithecus into Trachypithecus via male introgression followed by nuclear swamping might have occurred until the earliest Pleistocene. Conclusions: Overall, our study provides the most comprehensive view on colobine evolution to date and emphasizes that analyses of various molecular markers, such as mobile elements and sequence data from multiple loci, are crucial to better understand evolutionary relationships and to trace hybridization events. Our results also suggest that sex-specific dispersal patterns, promoted by a respective social organization of the species involved, can result in different hybridization scenarios.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Louisiana State University

Publikationsserver des Robert Koch-Instituts

A polynomial time algorithm for calculating the probability of a ranked gene tree given a species tree

Author: AJ Drummond
AWF Edwards
B Schiever
C Semple
C Than
C Than
C Than
D Harel
E Mossel
ES Allman
GB Ewing
H Huang
J Heled
J Wakeley
James H Degnan
JH Degnan
JH Degnan
JH Degnan
JH Degnan
JH Degnan
L Liu
L Liu
L Liu
L Liu
L Liu
LS Kubatko
NA Rosenberg
NA Rosenberg
P Pamilo
S Ross
S Tavaré
Tanja Stadler
WP Maddison
Y Wang
Y Wu
Publication venue: Springer Science and Business Media LLC

Crossref

Inference of reticulate evolutionary histories by maximum likelihood: the performance of information criteria

Author: A Luo
A Rambaut
A Rambaut
A Rokas
C Andam
C Delwiche
C Kurland
C Linder
C Meng
C Than
CH Kuo
D Posada
D Posada
D Posada
DA Pollard
DR Robinson
G Jin
G Jin
GE Schwarz
H Akaike
H Ochman
H Park
H Park
Hyun Jung Park
J Felsenstein
J Mallet
J Mallet
J Mower
J Syring
JH Degnan
L Rieseberg
L Rieseberg
LS Kubatko
Luay Nakhleh
M Arnold
M McClilland
M Noor
N Ellstrand
N Galtier
R Welch
U Bergthorsson
U Bergthorsson
W Doolittle
W Doolittle
W Hao
WP Maddison
X Didelot
X Didelot
X Didelot
Y Nakamura
Y Yu
Y Yu
Publication date: 01/01/2012

Background: Maximum likelihood has been widely used for over three decades to infer phylogenetic trees from molecular data. When reticulate evolutionary events occur, several genomic regions may have conflicting evolutionary histories, and a phylogenetic network may provide a more adequate model for representing the evolutionary history of the genomes or species. A maximum likelihood (ML) model has been proposed for this case and accounts for both mutation within a genomic region and reticulation across the regions. However, the performance of this model in terms of inferring information about reticulate evolution and properties that affect this performance have not been studied. Results: In this paper, we study the effect of the evolutionary diameter and height of a reticulation event on its identifiability under ML. We find both of them, particularly the diameter, have a significant effect. Further, we find that the number of genes (which can be generalized to the concept of "non-recombining genomic regions") that are transferred across a reticulation edge affects its detectability. Last but not least, a fundamental challenge with phylogenetic networks is that they allow an arbitrary level of complexity, giving rise to the model selection problem. We investigate the performance of two information criteria, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), for addressing this problem. We find that BIC performs well in general for controlling the model complexity and preventing ML from grossly overestimating the number of reticulation events. Conclusion: Our results demonstrate that BIC provides a good framework for inferring reticulate evolutionary histories. Nevertheless, the results call for caution when interpreting the accuracy of the inference particularly for data sets with particular evolutionary features.
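
The two criteria named in this abstract have standard textbook definitions: AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of free parameters, n the number of data points, and ln L the maximized log-likelihood. The sketch below shows how they can rank a tree against a network with one extra reticulation. The log-likelihoods, parameter counts, and sample size in the usage note are invented for illustration, and how k and n are counted for phylogenetic networks is an assumption not taken from the paper.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * k - 2.0 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: k ln(n) - 2 ln L (lower is better)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return k * math.log(n) - 2.0 * log_likelihood


def select_model(candidates, n: int, criterion: str = "bic") -> str:
    """Return the name of the candidate with the lowest score.

    candidates: iterable of (name, log_likelihood, k) tuples.
    """
    if criterion == "aic":
        score = lambda c: aic(c[1], c[2])
    elif criterion == "bic":
        score = lambda c: bic(c[1], c[2], n)
    else:
        raise ValueError("criterion must be 'aic' or 'bic'")
    return min(candidates, key=score)[0]


# Hypothetical values: the network fits slightly better but has
# two more parameters than the tree.
EXAMPLE = [("tree", -1000.0, 5), ("network", -997.5, 7)]
```

With these made-up numbers and n = 100, `select_model(EXAMPLE, 100, "aic")` prefers the network while `select_model(EXAMPLE, 100, "bic")` prefers the tree, because the BIC penalty per parameter, ln(100) ≈ 4.6, exceeds the AIC penalty of 2.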

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

DSpace at Rice University

Inferring Phylogenies from RAD Sequence Data

Author: A Drummond
A Rokas
A Stamatakis
AB Prasad
AD Cutter
AMC Russo
B Rannala
Benjamin E. R. Rubin
C Ané
Corrie S. Moreau
D Bryant
DA Pollard
DM Hillis
EP de Villiers
F Wu
GE Sims
GV Glazko
H Li
H Philippe
H Philippe
H Philippe
J Bergsten
J Catchen
J Felsenstein
JG Burleigh
JJ Weins
JJ Weins
JT Foster
JW Taylor
K Tamura
KJ Emerson
L Liu
LL Knowles
LS Kubatko
ML Metzker
MR Miller
MR Miller
NA Baird
PA Hohenlohe
RC Edgar
RC Edgar
Richard H. Ree
RM Adkins
S Steppan
Sergios-Orestis Kolokotronis
TM Fulton
WJ Murphy
ZA Lewis
Publication venue: Public Library of Science
Publication date: 06/04/2012

Reduced-representation genome sequencing represents a new source of data for systematics, and its potential utility in interspecific phylogeny reconstruction has not yet been explored. One approach that seems especially promising is the use of inexpensive short-read technologies (e.g., Illumina, SOLiD) to sequence restriction-site associated DNA (RAD) – the regions of the genome that flank the recognition sites of restriction enzymes. In this study, we simulated the collection of RAD sequences from sequenced genomes of different taxa (Drosophila, mammals, and yeasts) and developed a proof-of-concept workflow to test whether informative data could be extracted and used to accurately reconstruct “known” phylogenies of species within each group. The workflow consists of three basic steps: first, sequences are clustered by similarity to estimate orthology; second, clusters are filtered by taxonomic coverage; and third, they are aligned and concatenated for “total evidence” phylogenetic analysis. We evaluated the performance of clustering and filtering parameters by comparing the resulting topologies with well-supported reference trees and we were able to identify conditions under which the reference tree was inferred with high support. For Drosophila, whole genome alignments allowed us to directly evaluate which parameters most consistently recovered orthologous sequences. For the parameter ranges explored, we recovered the best results at the low ends of sequence similarity and taxonomic representation of loci; these generated the largest supermatrices with the highest proportion of missing data. Applications of the method to mammals and yeasts were less successful, which we suggest may be due partly to their much deeper evolutionary divergence times compared to Drosophila (crown ages of approximately 100 and 300 versus 60 Mya, respectively). RAD sequences thus appear to hold promise for reconstructing phylogenetic relationships in younger clades in which sufficient numbers of orthologous restriction sites are retained across species.
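
The second and third steps of the workflow described above (filtering clusters by taxonomic coverage, then concatenating them into a supermatrix) can be sketched as follows. This is an illustrative sketch and not the authors' pipeline: it assumes clustering and alignment have already been done, that each cluster is a dictionary mapping a taxon name to an aligned sequence with at most one sequence per taxon, and that absent taxa are padded with "?". All function and parameter names are hypothetical.

```python
def filter_by_coverage(clusters, min_taxa):
    """Keep only clusters that contain sequences from at least
    min_taxa distinct taxa."""
    return [c for c in clusters if len(c) >= min_taxa]


def concatenate(clusters, taxa, missing="?"):
    """Concatenate aligned clusters into a supermatrix.

    clusters: list of dicts mapping taxon name -> aligned sequence.
    taxa:     ordered list of all taxon names for the matrix rows.
    Taxa absent from a cluster are padded with the missing symbol.
    """
    rows = {t: [] for t in taxa}
    for cluster in clusters:
        lengths = {len(seq) for seq in cluster.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in a cluster must be aligned "
                             "to the same length")
        length = lengths.pop()
        for t in taxa:
            rows[t].append(cluster.get(t, missing * length))
    return {t: "".join(parts) for t, parts in rows.items()}


def missing_fraction(matrix, missing="?"):
    """Proportion of cells in the supermatrix that are missing data."""
    total = sum(len(seq) for seq in matrix.values())
    if total == 0:
        return 0.0
    gaps = sum(seq.count(missing) for seq in matrix.values())
    return gaps / total
```

Lowering `min_taxa` admits more loci and so lengthens the supermatrix, at the cost of a higher `missing_fraction`, which mirrors the trade-off the abstract reports.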

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare